Cushing
AI-Driven Document Redaction in UK Public Authorities: Implementation Gaps, Regulatory Challenges, and the Human Oversight Imperative
Document redaction in public authorities faces critical challenges as traditional manual approaches struggle to balance growing transparency demands with increasingly stringent data protection requirements. This study investigates the implementation of AI-driven document redaction within UK public authorities through Freedom of Information (FOI) requests. While AI technologies offer potential solutions to redaction challenges, their actual implementation within public sector organizations remains underexplored. Based on responses from 44 public authorities across healthcare, government, and higher education sectors, this study reveals significant gaps between technological possibilities and organizational realities. Findings show highly limited AI adoption (only one authority reported using AI tools), widespread absence of formal redaction policies (50 percent reported "information not held"), and deficiencies in staff training. The study identifies three key barriers to effective AI implementation: poor record-keeping practices, lack of standardized redaction guidelines, and insufficient specialized training for human oversight. These findings highlight the need for a socio-technical approach that balances technological automation with meaningful human expertise. This research provides the first empirical assessment of AI redaction practices in UK public authorities and contributes evidence to support policymakers navigating the complex interplay between transparency obligations, data protection requirements, and emerging AI technologies in public administration.
- Europe > United Kingdom > Northern Ireland (0.04)
- North America > United States > Hawaii (0.04)
- Oceania > Australia (0.04)
- (13 more...)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Education (1.00)
- (2 more...)
PARROT: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs
Çelebi, Yusuf, Ezerceli, Özay, Hussieni, Mahmoud El
This study presents PARROT (Persuasion and Agreement Robustness Rating of Output Truth), a robustness focused framework designed to measure the degradation in accuracy that occurs under social pressure exerted on users through authority and persuasion in large language models (LLMs) the phenomenon of sycophancy (excessive conformity). PARROT (i) isolates causal effects by comparing the neutral version of the same question with an authoritatively false version using a double-blind evaluation, (ii) quantifies confidence shifts toward the correct and imposed false responses using log-likelihood-based calibration tracking, and (iii) systematically classifies failure modes (e.g., robust correct, sycophantic agreement, reinforced error, stubborn error, self-correction, etc.) using an eight-state behavioral taxonomy. We evaluated 22 models using 1,302 MMLU-style multiple-choice questions across 13 domains and domain-specific authority templates. Findings show marked heterogeneity: advanced models (e.g., GPT-5, GPT-4.1, Claude Sonnet 4.5) exhibit low "follow rates" ($\leq 11\%$, GPT-5: 4\%) and minimal accuracy loss, while older/smaller models show severe epistemic collapse (GPT-4: 80\%, Qwen 2.5-1.5B: 94\%). The danger is not limited to response changes; weak models reduce confidence in the correct response while increasing confidence in the imposed incorrect response. While international law and global knowledge at the domain level exhibit high fragility, elementary mathematics is relatively resilient. Consequently, we argue that the goal of "resistance to overfitting pressure" should be addressed as a primary objective alongside accuracy, harm avoidance, and privacy for safe deployment in the real world.
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- (2 more...)
- Information Technology (0.68)
- Education (0.66)
- Law > International Law (0.55)
- Health & Medicine > Therapeutic Area > Endocrinology (0.46)
Asymptotic Theory and Phase Transitions for Variable Importance in Quantile Regression Forests
Nakamura, Tomoshige, Shiraishi, Hiroshi
Quantile Regression Forests (QRF) are widely used for non-parametric conditional quantile estimation, yet statistical inference for variable importance measures remains challenging due to the non-smoothness of the loss function and the complex bias-variance trade-off. In this paper, we develop a asymptotic theory for variable importance defined as the difference in pinball loss risks. We first establish the asymptotic normality of the QRF estimator by handling the non-differentiable pinball loss via Knight's identity. Second, we uncover a "phase transition" phenomenon governed by the subsampling rate $β$ (where $s \asymp n^β$). We prove that in the bias-dominated regime ($β\ge 1/2$), which corresponds to large subsample sizes typically favored in practice to maximize predictive accuracy, standard inference breaks down as the estimator converges to a deterministic bias constant rather than a zero-mean normal distribution. Finally, we derive the explicit analytic form of this asymptotic bias and discuss the theoretical feasibility of restoring valid inference via analytic bias correction. Our results highlight a fundamental trade-off between predictive performance and inferential validity, providing a theoretical foundation for understanding the intrinsic limitations of random forest inference in high-dimensional settings.
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- Asia > Pakistan (0.04)
Beyond Awareness: Investigating How AI and Psychological Factors Shape Human Self-Confidence Calibration
Cau, Federico Maria, Spano, Lucio Davide
Human-AI collaboration outcomes depend strongly on human self-confidence calibration, which drives reliance or resistance toward AI's suggestions. This work presents two studies examining whether calibration of self-confidence before decision tasks, low versus high levels of Need for Cognition (NFC), and Actively Open-Minded Thinking (AOT), leads to differences in decision accuracy, self-confidence appropriateness during the tasks, and metacognitive perceptions (global and affective). The first study presents strategies to identify well-calibrated users, also comparing decision accuracy and the appropriateness of self-confidence across NFC and AOT levels. The second study investigates the effects of calibrated self-confidence in AI-assisted decision-making (no AI, two-stage AI, and personalized AI), also considering different NFC and AOT levels. Our results show the importance of human self-confidence calibration and psychological traits when designing AI-assisted decision systems. We further propose design recommendations to address the challenge of calibrating self-confidence and supporting tailored, user-centric AI that accounts for individual traits.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- (36 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.46)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > New York (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
SurvBench: A Standardised Preprocessing Pipeline for Multi-Modal Electronic Health Record Survival Analysis
Mesinovic, Munib, Zhu, Tingting
Electronic health record (EHR) data present tremendous opportunities for advancing survival analysis through deep learning, yet reproducibility remains severely constrained by inconsistent preprocessing methodologies. We present SurvBench, a comprehensive, open-source preprocessing pipeline that transforms raw PhysioNet datasets into standardised, model-ready tensors for multi-modal survival analysis. SurvBench provides data loaders for three major critical care databases, MIMIC-IV, eICU, and MC-MED, supporting diverse modalities including time-series vitals, static demographics, ICD diagnosis codes, and radiology reports. The pipeline implements rigorous data quality controls, patient-level splitting to prevent data leakage, explicit missingness tracking, and standardised temporal aggregation. SurvBench handles both single-risk (e.g., in-hospital mortality) and competing-risks scenarios (e.g., multiple discharge outcomes). The outputs are compatible with pycox library packages and implementations of standard statistical and deep learning models. By providing reproducible, configuration-driven preprocessing with comprehensive documentation, SurvBench addresses the "preprocessing gap" that has hindered fair comparison of deep learning survival models, enabling researchers to focus on methodological innovation rather than data engineering.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- Europe > United Kingdom > Wales (0.04)
- Asia > Middle East > Israel (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
Forecasting Spoken Language Development in Children with Cochlear Implants Using Preimplantation MRI
Wang, Yanlin, Yuan, Di, Dettman, Shani, Choo, Dawn, Xu, Emily Shimeng, Thomas, Denise, Ryan, Maura E, Wong, Patrick C M, Young, Nancy M
Cochlear implants (CI) significantly improve spoken language in children with severe-to-profound sensorineural hearing loss (SNHL), yet outcomes remain more variable than in children with normal hearing. This variability cannot be reliably predicted for individual children using age at implantation or residual hearing. This study aims to compare the accuracy of traditional machine learning (ML) to deep transfer learning (DTL) algorithms to predict post-CI spoken language development of children with bilateral SNHL using a binary classification model of high versus low language improvers. A total of 278 implanted children enrolled from three centers. The accuracy, sensitivity and specificity of prediction models based upon brain neuroanatomic features using traditional ML and DTL learning. DTL prediction models using bilinear attention-based fusion strategy achieved: accuracy of 92.39% (95% CI, 90.70%-94.07%), sensitivity of 91.22% (95% CI, 89.98%-92.47%), specificity of 93.56% (95% CI, 90.91%-96.21%), and area under the curve (AUC) of 0.977 (95% CI, 0.969-0.986). DTL outperformed traditional ML models in all outcome measures. DTL was significantly improved by direct capture of discriminative and task-specific information that are advantages of representation learning enabled by this approach over ML. The results support the feasibility of a single DTL prediction model for language prediction of children served by CI programs worldwide.
- North America > United States > Illinois > Cook County > Chicago (0.06)
- Asia > China > Hong Kong (0.06)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (3 more...)
- Research Report > Strength Medium (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Otolaryngology (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
3D Mapping Using a Lightweight and Low-Power Monocular Camera Embedded inside a Gripper of Limbed Climbing Robots
Okawara, Taku, Nishibe, Ryo, Kasano, Mao, Uno, Kentaro, Yoshida, Kazuya
Limbed climbing robots are designed to explore challenging vertical walls, such as the skylights of the Moon and Mars. In such robots, the primary role of a hand-eye camera is to accurately estimate 3D positions of graspable points (i.e., convex terrain surfaces) thanks to its close-up views. While conventional climbing robots often employ RGB-D cameras as hand-eye cameras to facilitate straightforward 3D terrain mapping and graspable point detection, RGB-D cameras are large and consume considerable power. This work presents a 3D terrain mapping system designed for space exploration using limbed climbing robots equipped with a monocular hand-eye camera. Compared to RGB-D cameras, monocular cameras are more lightweight, compact structures, and have lower power consumption. Although monocular SLAM can be used to construct 3D maps, it suffers from scale ambiguity. To address this limitation, we propose a SLAM method that fuses monocular visual constraints with limb forward kinematics. The proposed method jointly estimates time-series gripper poses and the global metric scale of the 3D map based on factor graph optimization. We validate the proposed framework through both physics-based simulations and real-world experiments. The results demonstrate that our framework constructs a metrically scaled 3D terrain map in real-time and enables autonomous grasping of convex terrain surfaces using a monocular hand-eye camera, without relying on RGB-D cameras. Our method contributes to scalable and energy-efficient perception for future space missions involving limbed climbing robots. See the video summary here: https://youtu.be/fMBrrVNKJfc
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- Asia > Japan > Kyūshū & Okinawa > Kyūshū > Kumamoto Prefecture > Kumamoto (0.04)
- Asia > Japan > Honshū > Tōhoku > Miyagi Prefecture > Sendai (0.04)
- Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
Proceedings of the 2025 XCSP3 Competition
Audemard, Gilles, Lecoutre, Christophe, Lonca, Emmanuel
Competition 2025, following those published in 2022 [2], 2023 [3], and 2024 [4]. The website containing all detailed results of this international competition is available at: https://www.cril.univ-artois.fr/XCSP25 The organization of this 2025 competition involved the following tasks: adjusting general details (dates, tracks, .. . These instances can be found in this archive. Some (usually minor) differences may exist when compiling the models presented in this document and those that can be found in this archive. Remember that the complete description, Version 3.2, of the format (XCSP For the 2025 competition, 33 problems have been selected. They are succinctly presented in Table 1.1. For each problem, the type of the involved (global) constraints is indicated. At this point, do note that making a good selection of problems/instances is a difficult task. When table is followed by (), it means that starred tables are involved. It is always interesting to see how constraint solvers behave when the instances of a problem become harder and harder. This is what we call the scaling behavior of solvers.
- Europe > Austria > Styria > Graz (0.04)
- North America > United States > Kansas (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- (42 more...)
- Information Technology > Security & Privacy (0.46)
- Leisure & Entertainment > Sports (0.45)
Sparse and nonparametric estimation of equations governing dynamical systems with applications to biology
Pillonetto, G., Giaretta, A., Aravkin, A., Bisiacco, M., Elston, T.
Data-driven discovery of model equations is a powerful approach for understanding the behavior of dynamical systems in many scientific fields. In particular, the ability to learn mathematical models from data would benefit systems biology, where the complex nature of these systems often makes a bottom up approach to modeling unfeasible. In recent years, sparse estimation techniques have gained prominence in system identification, primarily using parametric paradigms to efficiently capture system dynamics with minimal model complexity. In particular, the Sindy algorithm has successfully used sparsity to estimate nonlinear systems by extracting from a library of functions only a few key terms needed to capture the dynamics of these systems. However, parametric models often fall short in accurately representing certain nonlinearities inherent in complex systems. To address this limitation, we introduce a novel framework that integrates sparse parametric estimation with nonparametric techniques. It captures nonlinearities that Sindy cannot describe without requiring a priori information about their functional form. That is, without expanding the library of functions to include the one that is trying to be discovered. We illustrate our approach on several examples related to estimation of complex biological phenomena.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Oklahoma > Payne County > Cushing (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (2 more...)
- Health & Medicine (1.00)
- Education > Curriculum > Subject-Specific Education (0.34)